Bergen et al., 2021 Beyond the scope of computational modeling, the statistical power of the methods depends on the curvature in the phase portrait since a lack of curvature challenges current models to distinguish whether an up- or down-regulation is occurring. The overall curvature of deviation from the steady-state line in the phase portrait is mostly impacted by the ratios of splicing to degradation rates (Box 1), indicating that statistical inference is limited to genes where splicing is faster or comparable to degradation, while a small ratio would yield straight lines rather than an interpretable curvature.
import scvelo as scv
import numpy as np
import pandas as pd
import scipy
import scanpy as sc
import os
sc.settings.vector_friendly = False
scv.set_figure_params(figsize=(2, 5))
os.chdir('/research/peer/fdeckert/FD20200109SPLENO')
adata_file = 'data/object/scvelo.h5ad'
adata = sc.read_h5ad('data/object/velocyto.h5ad')
obs = pd.read_csv('data/object/int/meta/meta.csv', index_col=0)
obsm = pd.read_csv('data/object/int/reductions/X_umap/reduction.csv', index_col=0)
population_names = ['MPP (1)', 'Ery (1)', 'Ery (2)', 'Ery (3)', 'Ery (4)', 'Ery (5)', 'Ery (6)']
celltype_colours = [
'#FA9FB5',
'#FC9272',
'#FB6A4A',
'#EF3B2C',
'#CB181D',
'#A50F15',
'#67000D'
]
# Filter obs by Ery annotation and treatment
obs = obs[obs['leiden_annotation'].isin(population_names)]
# Filter obsm by cell index
obsm = obsm[obsm.index.isin(obs.index)]
# Filter velocity adata by obs
adata = adata[adata.obs.index.isin(obs.index)]
# Order index to match velocity adata
obs = obs.reindex(adata.obs.index)
obsm = obsm.reindex(adata.obs.index)
adata.obs = obs
adata.obsm['X_umap'] = obsm.to_numpy()
adata.uns['leiden_annotation_colors'] = celltype_colours
adata.obs['leiden_annotation'] = pd.Series(adata.obs['leiden_annotation'], dtype="category")
adata.obs['leiden_annotation'].cat.reorder_categories(population_names, inplace=True)
sc.pl.umap(adata, color=['leiden', 'tissue', 'treatment', 'leiden_annotation', 'sample_rep', 'cc_phase_class', 'pMt_RNA', 'pHb_RNA', 'pRb_RNA'], wspace=1, ncols=5)
... storing 'sample_name' as categorical ... storing 'sample_rep' as categorical ... storing 'tissue' as categorical ... storing 'treatment' as categorical ... storing 'sample_group' as categorical ... storing 'sample_path' as categorical ... storing 'sample_dir' as categorical ... storing 'cellranger_class' as categorical ... storing 'cc_phase_class' as categorical ... storing 'label_main_immgen' as categorical ... storing 'label_fine_immgen' as categorical ... storing 'label_main_haemosphere' as categorical ... storing 'label_fine_haemosphere' as categorical ... storing 'qc_class' as categorical ... storing 'doublet_cluster' as categorical ... storing 'doublet_class' as categorical ... storing 'doublet_class_int' as categorical
adata_temp = adata.copy()
adata = adata_temp.copy()
adata = adata[adata.obs['treatment']=="NaCl"]
scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
Trying to set attribute `.obs` of view, copying.
Filtered out 26496 genes that are detected 20 counts (shared). Normalized count data: X, spliced, unspliced. Extracted 2000 highly variable genes. Logarithmized X.
scv.pp.moments(adata)
computing neighbors
finished (0:00:15) --> added
'distances' and 'connectivities', weighted adjacency matrices (adata.obsp)
computing moments based on connectivities
finished (0:00:01) --> added
'Ms' and 'Mu', moments of un/spliced abundances (adata.layers)
scv.tl.recover_dynamics(adata)
recovering dynamics (using 1/32 cores)
finished (0:12:47) --> added
'fit_pars', fitted parameters for splicing dynamics (adata.var)
scv.tl.velocity(adata, mode='dynamical')
scv.tl.velocity_graph(adata)
scv.tl.latent_time(adata)
computing velocities
finished (0:00:04) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
computing velocity graph (using 1/32 cores)
finished (0:00:12) --> added
'velocity_graph', sparse matrix with cosine correlations (adata.uns)
computing terminal states
WARNING: Uncertain or fuzzy root cell identification. Please verify.
identified 6 regions of root cells and 1 region of end points .
finished (0:00:01) --> added
'root_cells', root cells of Markov diffusion process (adata.obs)
'end_points', end points of Markov diffusion process (adata.obs)
computing latent time using root_cells as prior
finished (0:00:02) --> added
'latent_time', shared time (adata.obs)
scv.set_figure_params(figsize=(2, 5))
scv.pl.velocity_embedding_stream(adata, basis='X_umap', color=['leiden_annotation'])
computing velocity embedding
finished (0:00:02) --> added
'velocity_umap', embedded velocity vectors (adata.obsm)
sc.pl.umap(adata, color=['root_cells', 'end_points', 'velocity_pseudotime', 'latent_time'])
top_genes = adata.var['fit_likelihood'].sort_values(ascending=False).dropna().index
scv.tl.differential_kinetic_test(adata, var_names=top_genes, groupby='leiden_annotation')
testing for differential kinetics
finished (0:05:35) --> added
'fit_diff_kinetics', clusters displaying differential kinetics (adata.var)
'fit_pvals_kinetics', p-values of differential kinetics (adata.var)
adata.write_h5ad('scvelo_nacl.h5ad')
... storing 'fit_diff_kinetics' as categorical
scv.tl.velocity(adata, diff_kinetics=True)
scv.tl.velocity_graph(adata)
scv.tl.latent_time(adata)
computing velocities
finished (0:00:02) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
computing velocity graph (using 1/32 cores)
finished (0:00:20) --> added
'velocity_graph', sparse matrix with cosine correlations (adata.uns)
computing latent time using root_cells as prior
finished (0:00:01) --> added
'latent_time', shared time (adata.obs)
scv.set_figure_params(figsize=(2, 5))
scv.pl.velocity_embedding_stream(adata, basis='X_umap', color=['leiden_annotation'])
computing velocity embedding
finished (0:00:02) --> added
'velocity_umap', embedded velocity vectors (adata.obsm)
adata = adata_temp.copy()
adata = adata[adata.obs['treatment']=="CpG"]
scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
Trying to set attribute `.obs` of view, copying.
Filtered out 26926 genes that are detected 20 counts (shared). Normalized count data: X, spliced, unspliced. Extracted 2000 highly variable genes. Logarithmized X.
scv.pp.moments(adata, use_highly_variable=True)
computing neighbors
finished (0:00:03) --> added
'distances' and 'connectivities', weighted adjacency matrices (adata.obsp)
computing moments based on connectivities
finished (0:00:02) --> added
'Ms' and 'Mu', moments of un/spliced abundances (adata.layers)
scv.tl.recover_dynamics(adata)
recovering dynamics (using 1/32 cores)
finished (0:16:35) --> added
'fit_pars', fitted parameters for splicing dynamics (adata.var)
scv.tl.velocity(adata, mode='dynamical')
scv.tl.velocity_graph(adata)
scv.tl.latent_time(adata)
computing velocities
finished (0:00:07) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
computing velocity graph (using 1/32 cores)
finished (0:00:24) --> added
'velocity_graph', sparse matrix with cosine correlations (adata.uns)
computing terminal states
identified 6 regions of root cells and 2 regions of end points .
finished (0:00:04) --> added
'root_cells', root cells of Markov diffusion process (adata.obs)
'end_points', end points of Markov diffusion process (adata.obs)
computing latent time using root_cells as prior
finished (0:00:03) --> added
'latent_time', shared time (adata.obs)
scv.set_figure_params(figsize=(2, 5))
scv.pl.velocity_embedding_stream(adata, basis='X_umap', color=['leiden_annotation'])
computing velocity embedding
finished (0:00:04) --> added
'velocity_umap', embedded velocity vectors (adata.obsm)
sc.pl.umap(adata, color=['root_cells', 'end_points', 'velocity_pseudotime', 'latent_time'])
top_genes = adata.var['fit_likelihood'].sort_values(ascending=False).dropna().index
scv.tl.differential_kinetic_test(adata, var_names=top_genes, groupby='leiden_annotation')
testing for differential kinetics
finished (0:07:28) --> added
'fit_diff_kinetics', clusters displaying differential kinetics (adata.var)
'fit_pvals_kinetics', p-values of differential kinetics (adata.var)
adata.write_h5ad('scvelo_cpg.h5ad')
... storing 'fit_diff_kinetics' as categorical
scv.tl.velocity(adata, diff_kinetics=True)
scv.tl.velocity_graph(adata)
scv.tl.latent_time(adata)
computing velocities
finished (0:00:03) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
computing velocity graph (using 1/32 cores)
finished (0:00:38) --> added
'velocity_graph', sparse matrix with cosine correlations (adata.uns)
computing latent time using root_cells as prior
finished (0:00:03) --> added
'latent_time', shared time (adata.obs)
scv.set_figure_params(figsize=(2, 5))
scv.pl.velocity_embedding_stream(adata, basis='X_umap', color=['leiden_annotation'])
computing velocity embedding
finished (0:00:04) --> added
'velocity_umap', embedded velocity vectors (adata.obsm)